(54) FEATURE SELECTION METHOD

(57) Abstract:

PURPOSE: To provide a feature selection method capable of reducing the
amount of sum-of-products computation in the distance calculation
required to decide the class to which a pattern belongs.
CONSTITUTION: Feature vectors are extracted from a plurality of learning
patterns (step 1). The degree of approximation to a normal distribution is
obtained from statistical indexes (step 2). The selection criterion for the
feature quantities to be selected is decided (step 3). The feature vector of
an unknown pattern is extracted (step 4). Features of the vector extracted
in step 4 are selected based on the selection criterion (step 5). The
resulting low-dimensional feature vector is stored (step 6), and the class
to which the unknown pattern belongs is decided (step 7).
Japan Patent Office is not responsible for any
damages caused by the use of this translation.

1. This document has been translated by computer, so the translation may not reflect the original precisely.
2. **** shows a word which cannot be translated.
3. In the drawings, no words are translated.



CLAIMS 



[Claim(s)] 

[Claim 1] A feature-selection method for use when determining the class to which a pattern represented as a vector of two
or more feature quantities belongs, in which effective features are chosen from among the features expressed by the vector,
the feature vector is reduced to a low dimension, and the class to which the pattern belongs is determined, the method being
characterized by: extracting feature vectors from two or more training patterns; obtaining, for each dimension of the feature
vector, the distribution of values over all or some of the patterns belonging to each class; obtaining, with one or more
statistical indexes, the degree of approximation indicating how close this distribution is to a normal distribution;
determining the selection criterion for the feature quantities to be chosen by using these degrees of approximation
independently or in combination; extracting the feature vector of an unknown pattern; selecting features of the extracted
feature vector of the unknown pattern based on this selection criterion; and storing the resulting feature vector of reduced
dimensionality.



[Translation done.] 



DETAILED DESCRIPTION 



[Detailed Description of the Invention] 
[0001] 

[Industrial Application] This invention relates to a feature-selection method and, in particular, to a feature-selection method
that, when determining the class to which a pattern represented as a vector of two or more feature quantities belongs, chooses
two or more features from among the features expressed by the vector and determines the class using a low-dimensional
vector.
[0002] 

[Description of the Prior Art] Conventionally, when determining the class to which an input unknown pattern belongs, the
distance between the unknown pattern and a dictionary created beforehand from training patterns is calculated. A known
technique for compressing the number of feature dimensions in this process is to perform principal component analysis
using the covariance matrix of the feature vectors obtained from two or more training patterns, perform space conversion,
and use only the high-order components with large eigenvalues for the distance calculation.

[0003] That is, in general, for an m-dimensional feature vector X = (x1, x2, ..., xm)^t, the Bayes discriminant function

$$D(X) = (X-\mu)^t S^{-1} (X-\mu) - \ln|S|$$

is equivalent to

[0004]
[Equation 1]

$$D(X) = \sum_{i=1}^{m} \frac{\{\phi_i^t (X-\mu)\}^2}{\lambda_i} - \ln|S| \qquad (1)$$

[0005] Usually, the components with small eigenvalues λ are removed here, which compresses the number of feature
dimensions in the converted space and reduces the amount of computation. Here, S is the covariance matrix, μ is the mean
vector, φ is an eigenvector, and λ is an eigenvalue.
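
The equivalence between the quadratic form and its eigen-expansion can be checked numerically. The following is a minimal sketch for a 2x2 covariance matrix using closed-form eigenvalues. The matrix `S`, mean `mu`, and input `x` are made-up values for illustration, not data from the source, and the ln|S| term is left out because it is the same in both forms.

```python
import math

# Illustrative values (assumptions, not taken from the source).
S = [[4.0, 1.2],
     [1.2, 2.0]]          # symmetric covariance matrix
mu = [1.0, -0.5]          # mean vector
x = [2.5, 0.7]            # input feature vector

d = [x[0] - mu[0], x[1] - mu[1]]   # X - mu

# Quadratic form (X - mu)^t S^-1 (X - mu) via the explicit 2x2 inverse.
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
inv = [[S[1][1] / det, -S[0][1] / det],
       [-S[1][0] / det, S[0][0] / det]]
quad = sum(d[r] * inv[r][c] * d[c] for r in range(2) for c in range(2))

# Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
# The eigenvector formula below is valid because the off-diagonal is non-zero.
half_trace = (S[0][0] + S[1][1]) / 2.0
radius = math.hypot((S[0][0] - S[1][1]) / 2.0, S[0][1])
eigenvalues = [half_trace + radius, half_trace - radius]

terms = []
for lam in eigenvalues:
    v = [S[0][1], lam - S[0][0]]            # unnormalized eigenvector
    norm = math.hypot(v[0], v[1])
    phi = [v[0] / norm, v[1] / norm]        # unit eigenvector
    proj = phi[0] * d[0] + phi[1] * d[1]    # phi^t (X - mu)
    terms.append(proj * proj / lam)

expansion = sum(terms)    # sum over all eigen-components
truncated = terms[0]      # keep only the largest-eigenvalue component

print(quad, expansion, truncated)
```

`quad` and `expansion` agree up to rounding error, while `truncated` shows what remains when the small-eigenvalue component is dropped. Note that every retained component still costs one dot product with the full input vector.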
[0006] 

[Problem(s) to be Solved by the Invention] However, when the above conventional technique performs dimension
compression, the space conversion requires the sum-of-products calculation shown in formula (1) above, and the amount of
computation accompanying this calculation is large.

[0007] This invention was made in view of the above point, and aims to solve the above conventional problem by
providing a feature-selection method that can reduce the amount of sum-of-products computation in the distance
calculation performed when determining the class to which a pattern belongs.
[0008] 

[Means for Solving the Problem] Drawing 1 is a diagram explaining the principle of this invention.

[0009] This invention is a feature-selection method for use when determining the class to which a pattern represented as a
vector of two or more feature quantities belongs, in which effective features are chosen from among the features expressed
by the vector, the feature vector is reduced to a low dimension, and the class to which the pattern belongs is determined.
Feature vectors are extracted from two or more training patterns (Step 1). For each dimension of the feature vector, the
distribution of values over all or some of the patterns belonging to each class is obtained, and the degree of approximation
indicating how close this distribution is to a normal distribution is obtained with one or more statistical indexes (Step 2).
The selection criterion for the feature quantities to be chosen is determined by using these degrees of approximation
independently or in combination (Step 3). The feature vector of an unknown pattern is extracted (Step 4), and features of
the vector extracted at Step 4 are selected based on the selection criterion determined at Step 3 (Step 5). The resulting
feature vector of reduced dimensionality is stored (Step 6), and the class to which the unknown pattern belongs is
determined (Step 7).
[0010] 

[Function] This invention exploits the property that the Bayes discriminant function achieves its highest discrimination
accuracy when each feature quantity constituting the vector follows a normal distribution. Only those features whose
per-dimension distribution is close to a normal distribution are chosen as effective features, and the distance calculation
uses the selected features in the original feature space, without space conversion. Because the features of the pattern
expressed by the vector are selected in the original feature space, the amount of computation is smaller than when the
number of feature dimensions is compressed by space conversion.
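
To make the saving concrete, the sketch below reduces one vector in both ways and counts the multiplications each needs. It is only an illustration: `project`, `select`, the two basis vectors, and the chosen indices are hypothetical, and the count covers the dimension-reduction step alone, not the distance calculation that follows.

```python
def project(x, basis):
    """Space conversion: one dot product (len(x) multiplications) per basis vector."""
    multiplications = 0
    out = []
    for phi in basis:
        acc = 0.0
        for xi, pi in zip(x, phi):
            acc += xi * pi
            multiplications += 1
        out.append(acc)
    return out, multiplications


def select(x, indices):
    """Feature selection in the original space: plain indexing, no arithmetic."""
    return [x[i] for i in indices], 0


x = [0.5, 1.0, -2.0, 3.0]              # hypothetical 4-dimensional input
basis = [[0.6, 0.8, 0.0, 0.0],         # two hypothetical orthonormal basis vectors
         [-0.8, 0.6, 0.0, 0.0]]
indices = [0, 3]                       # hypothetical selected features (0-based)

projected, projection_cost = project(x, basis)
selected, selection_cost = select(x, indices)
print(projection_cost, selection_cost)
```

Reducing an m-dimensional vector to k components by projection costs k times m multiplications in this count, whereas picking k features by index costs none.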
[0011] 

[Example] Hereafter, an example of this invention is explained with reference to the drawings.
[0012] Drawing 2 shows the system configuration of one example of this invention.

[0013] The system shown in this drawing consists of the feature-extraction section 10, the degree-of-approximation
calculation section 11, the selection-criterion determination section 12, the feature-selection section 13, and the
low-dimensional feature storing section 14.

[0014] Hereafter, the operation of the above configuration is explained.

[0015] Drawing 3 is a flow chart for explaining the outline of the operation of one example of this invention.

[0016] Step 10: First, the feature-extraction section 10 extracts feature quantities from two or more input training
patterns, expresses them as feature vectors, and inputs them into the degree-of-approximation calculation section 11.

[0017] Step 11: The degree-of-approximation calculation section 11 calculates, for every feature quantity, the degree of
approximation indicating how close the distribution of the feature quantity of each dimension of the feature vectors
extracted at Step 10 is to a normal distribution, using statistical indexes such as skewness and kurtosis.

[0018] Step 12: The selection-criterion determination section 12 determines the selection criterion for the features to be
chosen, using the two or more degrees of approximation obtained at Step 11 independently or in combination.

[0019] Step 13: When an unknown pattern is input into the feature-extraction section 10, its feature quantities are
extracted, expressed as a feature vector, and input into the feature-selection section 13.

[0020] Step 14: The feature-selection section 13 selects features from the feature vector of the unknown pattern extracted
at Step 13, based on the selection criterion determined in the selection-criterion determination section 12.

[0021] Step 15: The low-dimensional feature storing section 14 stores the feature vector of reduced dimensionality
selected from the feature vector of the unknown pattern in the feature-selection section 13.

[0022] The processing for determining the selection criterion in the selection-criterion determination section 12 is
explained here. The kurtosis and skewness obtained in the degree-of-approximation calculation section 11 and the feature
vectors of the N training patterns are input into the selection-criterion determination section 12. Here, when the feature
vectors of the N input training patterns are written as

$$X_1 = (x_{11}, x_{12}, \ldots, x_{1i}, \ldots, x_{1m})$$
$$X_2 = (x_{21}, x_{22}, \ldots, x_{2i}, \ldots, x_{2m})$$
$$\vdots$$
$$X_N = (x_{N1}, x_{N2}, \ldots, x_{Ni}, \ldots, x_{Nm})$$

the skewness A_i and the kurtosis B_i of the i-th feature can be obtained by the following formulas, respectively.
[0023]
[Equation 2]

$$A_i = \frac{1}{N} \sum_{j=1}^{N} \left\{ \frac{x_{ji} - \mu_i}{S_i} \right\}^3$$

$$B_i = \frac{1}{N} \sum_{j=1}^{N} \left\{ \frac{x_{ji} - \mu_i}{S_i} \right\}^4$$

[0024] In the above formulas, N is the number of training patterns, x_{ji} is the i-th feature quantity of the j-th training
pattern, μ_i is the mean of the i-th feature quantity over the training patterns, and S_i is the standard deviation of the i-th
feature quantity over the training patterns.
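
The two formulas translate directly into code. The following is a minimal sketch; the function name `skewness_kurtosis` and the sample data are illustrative, and the standard deviation is computed in its population form (dividing by N), which is an assumption because the source does not say which form is meant.

```python
import math


def skewness_kurtosis(patterns):
    """Return per-feature lists (skewness A_i, kurtosis B_i).

    `patterns` is a list of N feature vectors, each of length m.
    The standard deviation S_i divides by N (an assumption, see lead-in).
    """
    n = len(patterns)
    m = len(patterns[0])
    skew, kurt = [], []
    for i in range(m):
        column = [p[i] for p in patterns]          # i-th feature of every pattern
        mu = sum(column) / n
        s = math.sqrt(sum((v - mu) ** 2 for v in column) / n)
        z = [(v - mu) / s for v in column]         # standardized values
        skew.append(sum(t ** 3 for t in z) / n)
        kurt.append(sum(t ** 4 for t in z) / n)
    return skew, kurt


# Four hypothetical training patterns with two features each.
training = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 10.0]]
A, B = skewness_kurtosis(training)
print(A, B)
```

A feature whose value is identical in every training pattern has S_i = 0 and would raise a division error here, so such features need to be excluded beforehand.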

[0025] When each element of the vector follows a normal distribution, the skewness is 0.0 and the kurtosis is 3.0.
Skewness takes a negative value when the distribution is distorted to the left about its mean and, conversely, a positive
value when it is distorted to the right. Moreover, it is known that the sharpness of the center of the distribution increases
as kurtosis becomes large.
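
These reference values can be observed on synthetic samples. The sketch below is illustrative only: the helper `standardized_moment`, the seed, and the sample sizes are assumptions, and "distorted to the left" is read in the usual sense of a longer tail on the left.

```python
import random


def standardized_moment(values, order):
    """Mean of ((v - mean) / std) ** order, with std in population form."""
    n = len(values)
    mu = sum(values) / n
    s = (sum((v - mu) ** 2 for v in values) / n) ** 0.5
    return sum(((v - mu) / s) ** order for v in values) / n


rng = random.Random(0)   # fixed seed so the run is repeatable
normal = [rng.gauss(0.0, 1.0) for _ in range(20000)]
right_tail = [rng.expovariate(1.0) for _ in range(20000)]   # longer tail on the right
left_tail = [-v for v in right_tail]                        # mirrored: longer tail on the left

print("normal     skewness, kurtosis:",
      standardized_moment(normal, 3), standardized_moment(normal, 4))
print("right tail skewness:", standardized_moment(right_tail, 3))
print("left tail  skewness:", standardized_moment(left_tail, 3))
```

The normal sample gives a skewness near 0 and a kurtosis near 3, while the two exponential samples give skewness values of opposite sign.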

[0026] Given the skewness and kurtosis, the selection-criterion determination section 12 calculates from these values the
value used as the criterion for determining which features to select, and outputs the numbers of the features to be chosen
to the feature-selection section 13 as the selection criterion.

[0027] As criteria for choosing the features in the feature-selection section 13, one can use, for example, the absolute value
of the difference from the skewness of a normal distribution,

a = |A - 0.0| (3)

and the absolute value of the difference from the kurtosis of a normal distribution,

b = |B - 3.0| (4)

Possible selection rules include: a fixed number of features taken in ascending order of the value of a or of b; a fixed
number of features taken in ascending order of the sum of ranks, where each feature is ranked by formulas (3) and (4) with
smaller values of a and b ranked first; and a fixed number of features taken in ascending order of the value c obtained by
linearly combining the values a and b according to formula (5) below.

[0028]

c = αa + βb (5)

Here, α and β are parameters set to suitable values.
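
The three kinds of rule described above can be sketched as follows. This is one possible reading, not the patent's implementation: the function `select_features`, the rule names, the tie-breaking by feature index, and the example skewness and kurtosis values are all assumptions made for illustration.

```python
def ranks(values):
    """Rank of each entry, 0 for the smallest (ties broken by index)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order):
        result[index] = position
    return result


def select_features(A, B, count, rule="linear", alpha=1.0, beta=1.0):
    """Return the 0-based indices of `count` features judged closest to normal.

    A, B : per-feature skewness and kurtosis
    rule : "skewness", "kurtosis", "rank_sum", or "linear" (hypothetical names)
    """
    a = [abs(v - 0.0) for v in A]      # formula (3)
    b = [abs(v - 3.0) for v in B]      # formula (4)
    if rule == "skewness":
        score = a
    elif rule == "kurtosis":
        score = b
    elif rule == "rank_sum":
        score = [ra + rb for ra, rb in zip(ranks(a), ranks(b))]
    elif rule == "linear":
        score = [alpha * ai + beta * bi for ai, bi in zip(a, b)]   # formula (5)
    else:
        raise ValueError("unknown rule: " + rule)
    chosen = sorted(range(len(score)), key=lambda i: score[i])[:count]
    return sorted(chosen)


# Hypothetical skewness and kurtosis values for four features.
A = [0.05, -0.1, 0.8, -1.2]
B = [3.5, 3.05, 2.8, 6.0]

for rule in ("skewness", "kurtosis", "rank_sum", "linear"):
    print(rule, select_features(A, B, count=2, rule=rule))
print("linear, alpha=0.1", select_features(A, B, count=2, rule="linear", alpha=0.1))
```

With these values the rules do not all agree: weighting kurtosis more heavily (a small `alpha`) swaps feature 0 for feature 2, which illustrates why α and β have to be tuned to the data.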

[0029] The feature-selection section 13 stores the numbers of the features to be selected (the selection criterion)
determined by the selection-criterion determination section 12, and on that basis selects features from the feature vector of
the unknown pattern actually input into the feature-extraction section 10. When the m-dimensional feature vector extracted
from the unknown pattern in the feature-extraction section 10 is input into the feature-selection section 13, features are
selected in accordance with the selection criterion obtained by the selection-criterion determination section 12, and an
n-dimensional feature vector (m > n) is output to the low-dimensional feature storing section 14.

[0030] The low-dimensional feature storing section 14 stores the features of the unknown data reduced to a low dimension
in this way. For example, suppose an 8-dimensional vector as shown below is input:

Xin = (x1, x2, x3, x4, x5, x6, x7, x8)
Xout = (x1, x2, x5, x8)

When the feature-selection section 13 has stored the 1st, 2nd, 5th, and 8th features as the features to be chosen, the
4-dimensional feature vector Xout is output to the low-dimensional feature storing section 14.

[0031] The low-dimensional feature storing section 14 stores the features (x1, x2, x5, x8).
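
The 8-dimensional example amounts to indexing by the stored feature numbers. A minimal sketch, with the hypothetical helper `apply_selection` and string placeholders standing in for the feature values:

```python
def apply_selection(x_in, feature_numbers):
    """Pick the listed features from x_in; numbers are 1-based, as in the text."""
    return [x_in[k - 1] for k in feature_numbers]


x_in = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]
stored_feature_numbers = [1, 2, 5, 8]      # the stored selection criterion

x_out = apply_selection(x_in, stored_feature_numbers)
print(x_out)   # ['x1', 'x2', 'x5', 'x8']
```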

[0032] In addition, when new training patterns are given, features can always be selected in the optimal state by
recalculating the kurtosis and skewness and reconfiguring the feature decision criterion (selection criterion).

[0033] As described above, effective features are chosen, in accordance with the selection criterion, from the features
expressed by the vector of the input unknown pattern, the vector is thereby reduced to a low dimension, and the class to
which the unknown pattern belongs is determined.
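
The whole flow, from training patterns to a class decision, can be sketched end to end. Several parts of this sketch are assumptions rather than the method of the source: the per-class scores are simply summed, the linear-combination rule is used with α = β = 1, the training data are synthetic, and a nearest-class-mean classifier stands in for the Bayes discriminant function to keep the example short.

```python
import random


def moments(values):
    """Skewness and kurtosis of one feature (population standard deviation)."""
    n = len(values)
    mu = sum(values) / n
    s = (sum((v - mu) ** 2 for v in values) / n) ** 0.5
    z = [(v - mu) / s for v in values]
    return sum(t ** 3 for t in z) / n, sum(t ** 4 for t in z) / n


def choose_features(classes, count, alpha=1.0, beta=1.0):
    """classes: dict label -> training vectors. Returns 0-based feature indices."""
    m = len(next(iter(classes.values()))[0])
    score = [0.0] * m
    for vectors in classes.values():          # assumption: sum scores over classes
        for i in range(m):
            A, B = moments([v[i] for v in vectors])
            score[i] += alpha * abs(A - 0.0) + beta * abs(B - 3.0)
    return sorted(sorted(range(m), key=lambda i: score[i])[:count])


def take(x, indices):
    return [x[i] for i in indices]


def classify(x_low, class_means):
    """Stand-in classifier: nearest class mean by squared Euclidean distance."""
    def distance(label):
        return sum((p - q) ** 2 for p, q in zip(x_low, class_means[label]))
    return min(class_means, key=distance)


rng = random.Random(1)


def sample(center):
    # Features 0 and 2 are Gaussian; features 1 and 3 are exponential (non-normal).
    return [rng.gauss(center, 1.0), rng.expovariate(1.0),
            rng.gauss(-center, 1.0), rng.expovariate(0.5)]


training = {"A": [sample(0.0) for _ in range(500)],
            "B": [sample(4.0) for _ in range(500)]}

selected = choose_features(training, count=2)         # training side: criterion
class_means = {}
for name, vectors in training.items():
    low = [take(v, selected) for v in vectors]
    class_means[name] = [sum(col) / len(low) for col in zip(*low)]

unknown = [3.8, 0.3, -4.1, 1.0]                       # feature vector of an unknown pattern
x_low = take(unknown, selected)                       # selected and stored low-dimensional vector
label = classify(x_low, class_means)                  # class decision
print(selected, x_low, label)
```

When new training patterns arrive, calling `choose_features` again on the enlarged set corresponds to the recalculation of skewness and kurtosis described in [0032].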

[0034] 

[Effect of the Invention] As described above, according to this invention, the features used when classifying vector data
expressed with two or more feature quantities are selected in the original feature space. It is therefore unnecessary to
perform space conversion, and the amount of sum-of-products computation in the distance calculation performed when
determining the class to which an unknown pattern belongs can be reduced.



[Translation done.] 
DESCRIPTION OF DRAWINGS



[Brief Description of the Drawings]

[Drawing 1] It is a diagram explaining the principle of this invention.

[Drawing 2] It is a block diagram of one example of this invention.

[Drawing 3] It is a flow chart for explaining the outline of the operation of one example of this invention.

[Description of Notations]

10 Feature-Extraction Section

11 Degree-of-Approximation Calculation Section

12 Selection-Criterion Determination Section

13 Feature-Selection Section

14 Low-Dimensional Feature Storing Section